Online Update of Safety Assurances Using Confidence-Based Predictions
Robots such as autonomous vehicles and assistive manipulators are
increasingly operating in dynamic environments and close physical proximity to
people. In such scenarios, the robot can leverage a human motion predictor to
anticipate the human's future states and plan safe, efficient trajectories. However,
no model is ever perfect -- when the observed human behavior deviates from the
model predictions, the robot might plan unsafe maneuvers. Recent works have
explored maintaining a confidence parameter in the human model to overcome this
challenge, wherein the predicted human actions are tempered online based on the
likelihood of the observed human action under the prediction model. This has
opened up a new research challenge, i.e., \textit{how to compute the future
human states online as the confidence parameter changes?} In this work, we
propose a Hamilton-Jacobi (HJ) reachability-based approach to overcome this
challenge. Treating the confidence parameter as a virtual state in the system,
we compute a parameter-conditioned forward reachable tube (FRT) that provides
the future human states as a function of the confidence parameter. Online, as
the confidence parameter changes, we can simply query the corresponding FRT,
and use it to update the robot plan. Computing the parameter-conditioned FRT
corresponds to an (offline) high-dimensional reachability problem, which we
solve by leveraging recent advances in data-driven reachability analysis.
Overall, our framework enables online maintenance and updates of safety
assurances in human-robot interaction scenarios, even when the human prediction
model is incorrect. We demonstrate our approach in several safety-critical
autonomous driving scenarios, involving a state-of-the-art deep learning-based
prediction model.
Comment: 7 pages, 3 figures
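The online query step described above can be sketched as a lookup into a family of precomputed reachable tubes. Everything below is a toy stand-in under stated assumptions: the tubes are fabricated disks over a 2D position grid rather than the output of an actual HJ reachability solve, and the names (`betas`, `frt_grids`, `query_frt`) are illustrative, not from the paper.

```python
import numpy as np

# Hypothetical precomputed data: FRT occupancy grids over a 2D position grid,
# indexed by a discretized confidence parameter beta (the "virtual state").
betas = np.linspace(0.1, 1.0, 10)               # confidence grid
frt_grids = np.zeros((10, 50, 50), dtype=bool)  # frt_grids[i] = FRT for betas[i]

# Stand-in for the offline solve: low confidence tempers the model less, so the
# set of possible future human states (here, a disk) is larger.
xs = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(xs, xs, indexing="ij")
for i, b in enumerate(betas):
    frt_grids[i] = X**2 + Y**2 <= (4.0 * (1.5 - b)) ** 2

def query_frt(beta: float) -> np.ndarray:
    """Online query: snap the current confidence to the nearest precomputed
    slice and return the corresponding forward reachable tube."""
    i = int(np.argmin(np.abs(betas - beta)))
    return frt_grids[i]

# As confidence drops, the returned set of future human states grows, and the
# robot re-plans against the larger obstacle set.
assert query_frt(0.2).sum() > query_frt(0.9).sum()
```

The point of the pattern is that the expensive reachability computation happens once offline over the augmented state, so the online update reduces to an O(1) table lookup.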
MBMF: Model-Based Priors for Model-Free Reinforcement Learning
Reinforcement Learning is divided into two main paradigms: model-free and
model-based. Each paradigm has its own strengths and limitations and has been
successfully applied to real-world domains suited to those strengths. In this
paper, we present a new approach aimed at bridging the gap between the two
paradigms, combining the best of both in a method that is at once
data-efficient and cost-effective. We do so by learning a probabilistic dynamics
model and leveraging it as a prior for the intertwined model-free optimization.
As a result, our approach can exploit the generality and structure of the
dynamics model, yet is also capable of ignoring its inevitable inaccuracies by
directly incorporating the evidence provided by observed
costs. Preliminary results demonstrate that our approach outperforms purely
model-based and model-free approaches, as well as the approach of simply
switching from a model-based to a model-free setting.
Comment: After we submitted the paper for consideration in CoRL 2017, we found
a paper published in the recent past with a similar method (see related work
for a discussion). Considering the similarities between the two papers, we
have decided to retract our paper from CoRL 2017.
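The core idea, a model-based prediction acting as a prior that observed costs gradually override, can be illustrated with a deliberately simple sketch. This is not the paper's algorithm: the biased model, the prior strength of 5, the action grid, and the running-mean update are all assumptions chosen to make the effect visible.

```python
import random

random.seed(0)  # deterministic toy run

def model_cost(action: float) -> float:
    """Hypothetical (biased) model-based cost prediction: minimum at a = 1."""
    return (action - 1.0) ** 2

def true_cost(action: float) -> float:
    """Ground-truth cost the agent actually observes: minimum at a = 2."""
    return (action - 2.0) ** 2

estimate: dict = {}  # model-free running cost estimates per action
counts: dict = {}    # visit counts per action

def blended_cost(a: float) -> float:
    """Prior-weighted estimate: trust the model before data arrives, then
    shift toward the observed evidence as the visit count grows."""
    n = counts.get(a, 0)
    w = n / (n + 5.0)  # evidence weight; prior strength 5 is an assumption
    return (1 - w) * model_cost(a) + w * estimate.get(a, 0.0)

actions = [i * 0.5 for i in range(9)]  # candidate actions 0.0 .. 4.0
for _ in range(2000):                  # model-free phase: observe direct costs
    a = random.choice(actions)
    n = counts.get(a, 0)
    estimate[a] = (estimate.get(a, 0.0) * n + true_cost(a)) / (n + 1)
    counts[a] = n + 1

# The biased prior initially prefers a = 1, but the accumulated evidence
# shifts the blended estimate toward the true optimum at a = 2.
best = min(actions, key=blended_cost)
```

The design choice worth noting is the evidence weight `w`: with no data the blended estimate equals the model's prediction (data efficiency), and with abundant data it converges to the empirical cost, ignoring the model's inaccuracy.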
Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers
Autonomous systems, such as self-driving cars and drones, have made
significant strides in recent years by leveraging visual inputs and machine
learning for decision-making and control. Despite their impressive performance,
these vision-based controllers can make erroneous predictions when faced with
novel or out-of-distribution inputs. Such errors can cascade to catastrophic
system failures and compromise system safety. In this work, we introduce a
run-time anomaly monitor to detect and mitigate such closed-loop, system-level
failures. Specifically, we leverage a reachability-based framework to
stress-test the vision-based controller offline and mine its system-level
failures. This data is then used to train a classifier that is leveraged online
to flag inputs that might cause system breakdowns. The anomaly detector
highlights issues that transcend individual modules and pertain to the safety
of the overall system. We also design a fallback controller that robustly
handles these detected anomalies to preserve system safety. We validate the
proposed approach on an autonomous aircraft system that relies on a
vision-based controller for taxiing. Our results show the efficacy of the
proposed approach in identifying and handling system-level anomalies,
outperforming methods such as prediction-error-based detection and ensembling,
and thereby enhancing the overall safety and robustness of autonomous systems.
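The offline-mine-then-monitor pipeline can be sketched end to end. This is a toy stand-in under explicit assumptions: a scalar `brightness` feature replaces the real visual input, a threshold midpoint replaces a trained classifier, and the controller names (`track_centerline`, `brake_and_hold`) are hypothetical.

```python
def extract_feature(obs: dict) -> float:
    """Hypothetical 1-D feature summarizing the visual input."""
    return obs["brightness"]

# Offline: (observation, caused_system_failure) pairs mined by stress-testing
# the closed loop. In this toy, failures cluster at low brightness.
mined = [({"brightness": b / 10}, b < 3) for b in range(10)]

# "Train" a threshold classifier: midpoint between the two classes.
fail_feats = [extract_feature(o) for o, y in mined if y]
safe_feats = [extract_feature(o) for o, y in mined if not y]
threshold = (max(fail_feats) + min(safe_feats)) / 2

def is_anomalous(obs: dict) -> bool:
    """Online anomaly monitor: flag inputs resembling mined failures."""
    return extract_feature(obs) <= threshold

def nominal_controller(obs: dict) -> str:
    return "track_centerline"

def fallback_controller(obs: dict) -> str:
    return "brake_and_hold"

def control(obs: dict) -> str:
    """Runtime loop: route flagged inputs to the robust fallback controller."""
    return fallback_controller(obs) if is_anomalous(obs) else nominal_controller(obs)
```

The key property this structure captures is that the classifier is trained on *system-level* failure labels from closed-loop stress tests, not on per-module prediction errors, so it can flag inputs that break the overall system even when each module looks locally healthy.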
Parameter-Conditioned Reachable Sets for Updating Safety Assurances Online
Hamilton-Jacobi (HJ) reachability analysis is a powerful tool for analyzing
the safety of autonomous systems. However, the provided safety assurances are
often predicated on the assumption that once deployed, the system or its
environment does not evolve. Online, however, an autonomous system might
experience changes in system dynamics, control authority, external
disturbances, and/or the surrounding environment, requiring updated safety
assurances. Rather than restarting the safety analysis from scratch, which can
be time-consuming and often intractable to perform online, we propose to
compute \textit{parameter-conditioned} reachable sets. Assuming expected system
and environment changes can be parameterized, we treat these parameters as
virtual states in the system and leverage recent advances in high-dimensional
reachability analysis to solve the corresponding reachability problem offline.
This results in a family of reachable sets that is parameterized by the
environment and system factors. Online, as these factors change, the system can
simply query the corresponding safety function from this family to ensure
system safety, enabling a real-time update of the safety assurances. Through
various simulation studies, we demonstrate the capability of our approach in
maintaining system safety despite system and environment evolution.
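The online usage pattern this abstract describes can be sketched as a least-restrictive safety filter over a parameter-conditioned value function. The sketch below is illustrative only: the closed-form `value` function stands in for the output of the offline high-dimensional HJ solve, and the parameter `p` (a disturbance bound) and the control names are assumptions.

```python
def value(x: float, p: float) -> float:
    """Hypothetical parameter-conditioned safety value V(x; p): distance to an
    obstacle at the origin, minus a margin that grows with the disturbance
    bound p. Positive means the state is currently safe for that p."""
    return abs(x) - (1.0 + p)

def safe_control(x: float, p: float, u_nominal: float, u_safe: float) -> float:
    """Least-restrictive filter: apply the nominal control while the queried
    safety function is positive, else fall back to the safe control."""
    return u_nominal if value(x, p) > 0 else u_safe

# Same state, updated disturbance estimate: the safety assurance updates by a
# query rather than by re-solving the reachability problem.
assert safe_control(1.5, p=0.2, u_nominal=1.0, u_safe=-1.0) == 1.0   # still safe
assert safe_control(1.5, p=0.8, u_nominal=1.0, u_safe=-1.0) == -1.0  # intervene
```

As with the FRT case, the design rationale is to move all heavy computation offline over the augmented (state, parameter) space, leaving only a cheap evaluation in the real-time loop.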